Bulk Export To CSV (Assetic Python SDK)
This article will demonstrate the use of a python script to automate the export of data from Assetic to CSV files. The underlying process that the script relies on is the bulk export functionality of Advanced Search Profiles. For an overview of using Advanced Search to bulk export data, refer to this article on the Integration page.
NOTE The sync process requires the GUID of the Search Profile used to define the export data. To obtain this identifier simply open the Advanced Search Profile in Assetic and copy the GUID part from the URL in the browser navigation bar.
In the example screenshot below the full URL is:
https://xxx.assetic.net/Search/AdvancedSearchProfile/SPDashboard/Default/ff03c4bd-4f1d-e611-9458-06edd62954d7?loadprofile=True.
The GUID therefore is: ff03c4bd-4f1d-e611-9458-06edd62954d7
NOTE It is recommended to include in the search profile the field that defines the 'Last Modified Date' of the search records. This enables the export to skip if there have been no changes to the data
The Assetic Python Bulk Export To CSV process wraps the Assetic REST API for bulk exporting search profiles.
The process needs to be scheduled to run via a scheduler such as using the Windows Task Scheduler.
Where more than 10,000 records are returned by the Search Profile, or when utilising the 'always_background_export' parameter, the export becomes a 2-step process since it utilises the Assetic background worker process to generate the export. By default, when a profile has less than 10,000 results, the export is immediate.
For search profiles that include a last modified date field, there is a parameter that allows the export to be skipped if the data hasn't changed (the actual field name for the last modified date differs between modules).
Because the export is a 2 step process it uses the Assetic Documents API to create a document link (a document of type link) in Assetic that records the details about the first step of the export. Recording the details of the first step as a document link will allow the second step to be initiated at a later point in time. An advanced search may be created to view these export process documents as a way to monitor the export process.
The CSV file is written to the Python current working directory. See the samples below for how to set the current working directory in the Python script that executes the export process.
CSV Export Steps
1. Define export parameters
The export process requires information about the search profile(s).
This definition is provided via the Assetic Python SDK object "SearchProfileRepresentation". This representation is also used in the "Sync to Local Database" process described in the article Creating a Local Backup of Assetic Data (Assetic Python SDK). The object has the following fields relevant to the export to CSV:
Parameter | Description |
profileguid | The GUID of the search profile to be used for the data export. |
profilename | The name of the search profile. The value of 'profilename' is used as the name for the exported CSV file. The name does not need to exactly match the name of the Search Profile in Assetic, but it helps to identify the profile via the user-friendly name provided. |
keyfield | The name of the key field in the search data that uniquely identifies each record in the data. This is required to assist with the 'export immediate' process where the number of records in the search is less than 10,000 records. |
background_export |
The default is 'False'. A boolean flag to specify if an individual search profile will export via a background worker process. If set to 'True', this parameter can be used to bypass the 'export immediate' process for a search profile with results of less than 10,000 records and ensure it is instead exported via a background worker. If there is a requirement to have all exports initiated via a background worker, the 'always_background_export' boolean parameter can be provided instead when initiating the 'initiate_export_task_to_file()' method, which is outlined below. |
2. Initiate the export
The export process is initiated via the Assetic Python SDK method SyncToLocalProcesses.initiate_export_task_to_file().
It has the following mandatory parameter:
profile: a SearchProfileRepresentation object instance
There are also 4 optional parameters:
hidecontroldoc: a boolean value. Default is False. If True, then once the export process is complete the Assetic document used to manage the export is removed.
doc_group: The integer Id of the Document Group to apply to the document used to record the details of this first step. Obtain Id via API endpoint GET /api/v2/document/group
doc_category: The integer Id of the Document Category to apply to the document used to record the details of this first step. Obtain Id via API endpoint GET /api/v2/document/category
doc_subcategory: The integer Id of the Document Sub Category to apply to the document used to record the details of this first step. Obtain Id via API endpoint GET /api/v2/document/category/{id}/subcategory where {id} is the doc_category Id
force_export: The default is "False". If "True" then the export takes place regardless of whether the records have changed or not. This can be used to ensure the export is initialised following a change to the search profile such as the addition of new fields.
always_background_export: The default is "False". If "True" then all profiles will export via an Assetic background worker process. This can be used to ensure that all search profile exports will have a corresponding export file, including search profiles with less than 10,000 records that would otherwise be created immediately.
Below is a sample script to initiate the export process. For search profiles with less than 10,000 records, the CSV file will be created immediately as part of this initiation process.
- """
- Assetic.BulkExportInitialiseToCSV.py
- Initialises the bulk export process to CSV.
- Define a set of search profiles to be exported
- When this py script is run it uses the Assetic 'export all' api to initiate the
- export, which uses the Assetic background worker to queue and execute the export
- A second 'finalise' step run via a separate py script to check to see if the
- export is complete, and if so downloads the exported data
- Full documentation: https://assetic.zendesk.com/knowledge/articles/360000716655
- Author: Kevin Wilton (Assetic)
- """
- import assetic
- import os
- # get current working dir. Will reset at end
- cwd = os.getcwd()
- # set output folder location by changing working dir
- os.chdir(r'c:\temp\export_to_csv')
- # Define location of configuration file, log file, log level
- inifile = None
- asseticsdk=assetic.AsseticSDK(inifile, None, "Info")
- sync = assetic.SyncToLocalProcesses()
- profiles = list()
- # Profile for 'Assets Sync'
- profilemapping = assetic.SearchProfileRepresentation()
- profilemapping.profileguid = "d9a350d7-8078-e611-946c-06edd62954d7"
- profilemapping.profilename = "Water Pressure Pipes"
- profilemapping.keyfield = "[Asset Id]"
- profilemapping.background_export = True
- profiles.append(profilemapping)
- # Profile for 'Component Sync'
- profilemapping = assetic.SearchProfileRepresentation()
- profilemapping.profileguid = "65ffd995-4e76-e611-946c-06edd62954d7"
- profilemapping.profilename = "All Components"
- profilemapping.keyfield = "[Component Id]"
- profilemapping.background_export = False
- profiles.append(profilemapping)
- # Initialise the exports.
- for profile in profiles:
- chk = sync.initiate_export_task_to_file(profile, True, 2, 2, 10010,
- False, False)
- print(chk)
- # Change working directory back to initial directory
- os.chdir(cwd)
This process may be scheduled to run on a periodic basis, such as weekly.
3. Finalise export
The Assetic Python SDK method 'SyncToLocalProcesses.finalise_export_task_to_file()' is used to check the status of export tasks, and if the export task is complete, the csv data is downloaded and saved as a CSV file. The 'finalise_export_task_to_file' method is also run as a scheduled process (at a later time to the SyncToLocalProcesses.initiate_export_task_to_file).
The 'SyncToLocalProcesses.finalise_export_task_to_file()' method has the following optional parameter:
hidecontroldoc: a boolean value. Default is False. If True, then once the export process is complete the Assetic document used to manage the export is removed.
Below is a sample script to finalise the export process, which will download the exports triggered via a background worker export task.
- """
- Assetic.BulkExportFinalise.py
- Finalises the bulk export process.
- When this py script is run it checks to see if there are any outstanding bulk
- exports awaiting download. This is indicated by document links in Assetic that
- follow a specific structure
- The Assetic API's are used to check if the export is ready for download and if
- so, the data is downloaded to csv.
- Full documentation: https://assetic.zendesk.com/knowledge/articles/360000716655
- Author: Kevin Wilton (Assetic)
- """
- import assetic
- import os
- # get current working dir. Will reset at end
- cwd = os.getcwd()
- # set output folder location by changing working dir
- os.chdir(r'c:\temp\export_to_csv')
- inifile = None
- asseticsdk=assetic.AsseticSDK(inifile, None, "Info")
- sync = assetic.SyncToLocalProcesses()
- # run process to finalise the exports
- sync.finalise_export_task_to_file(True)
- # reset working directory
- os.chdir(cwd)
How it Works
The export process is the same as for the Export Sync to Database outlined in this article.
The difference is that the CSV export uses Assetic document links to record the progress of the export process rather than using a database table to record the exports in progress.
The Advanced Search for documents can be used to view CSV exports that have been initiated and completed. Add the "External Id" and "Description" fields to the Documents search field list because the "External Id" field contains the Task Id of the background export task and the "Description" field records the number of records processed and DateTime of the most recently changed record if exporting immediately (<10,000 records).
An export process that has been initiated (SyncToLocalProcesses.initiate_export_task_to_file) but not finalised (SyncToLocalProcesses.finalise_export_task_to_file) will have a value for External Id with the same format as the following example:
TaskID=664553b4-8806-4aaa-af67-074d2c013100;DocumentId=Not Processed
In this example a bulk export task (Id="664553b4-8806-4aaa-af67-074d2c013100") has been created, but it is either still in progress, or the SyncToLocalProcesses.finalise_export_task_to_file method is yet to be executed to finalise the csv export process.
The Assetic REST API GET /api/v2/backgroundworker/{id} is used by the python export process to get the status of the background worker export task and if the status is "Complete" the document Id of the export is also returned.
https://[your_site].assetic.net/api/v2/backgroundworker/664553b4-8806-4aaa-af67-074d2c013100
The Assetic REST API GET /api/v2/document/{id} is then used to download the export document which is saved as a CSV file in the python current working directory.
Once the file is downloaded the document link used to record the background export task id is then updated to indicate the file has been downloaded. The updated external Id and description fields of the document link will have the following format:
"ExternalId": "TaskID=4f1cbf0f-2435-4750-a60f-eaebf474057b;DocumentId=14b622e1-847d-4c2a-b444-12ad4fd3406a",
"Description": "Record Count:10710",
The Document Id of the export file is now held in the external Id field, along with the task Id of the background worker process that created the export. The number of exported records is held in the Description property of the document. The Task Id is used to get the DateTime that the export started (not the time it was initiated) and that DateTime is used to search for records with a more recent last modified date.
The CSV file is created immediately (SyncToLocalProcesses.initiate_export_task_to_file) in the event that there were less than 10,000 records returned by the advanced search and if the export process was not initiated with the 'always_background_export' set to 'True'.
In the case of an immediate export, the document link is created with an external Id and Description of the following format:
"ExternalId": "Immediate export via paginated search",
"Description": "Record Count:22; Max_Last_Modified 2021-03-01T12:06:26"